Mining Rare Associations between Biological Ontologies
Authors
Abstract
The constantly increasing volume and complexity of available biological data require new methods for their management and analysis. An important challenge is the integration of information from different sources in order to discover possible hidden relations between already known data. In this paper we introduce a data mining approach which relates biological ontologies by mining cross and intra-ontology pairwise generalized association rules. Its advantage is sensitivity to rare associations, which are important to biologists. We propose a new class of interestingness measures designed for hierarchically organized rules. These measures allow one to select the most important rules and to take into account rare cases. They favor rules with an actual interestingness value that exceeds the expected value. The latter is calculated taking into account the parent rule. We demonstrate this approach by applying it to the analysis of data from Gene Ontology and GPCR databases. Our objective is to discover interesting relations between two different ontologies or parts of a single ontology. The association rules that are thus discovered can provide the user with new knowledge about underlying biological processes or help improve annotation consistency. The obtained results show that the produced rules represent meaningful and quite reliable associations.
Similar resources
Generalized Association Rules for Connecting Biological Ontologies
The constantly increasing volume and complexity of available biological data require new methods for managing and analyzing them. An important challenge is the integration of information from different sources in order to discover possible hidden relations between already known data. In this paper we introduce a data mining approach which relates biological ontologies by mining generalized ass...
Correction: Mining Rare Associations between Biological Ontologies
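There is an error in the first equation. Please see the corrected Equation 1 here.

$$
\mathrm{SupExp} =
\begin{cases}
p_a \times p_b, & \text{1) if } a, b \in \mathit{roots} \\
p_{\hat{a} b} \times \dfrac{p_a}{p_{\hat{a}}}, & \text{2) if } a \notin \mathit{roots} \text{ and } b \in \mathit{roots} \\
p_{a \hat{b}} \times \dfrac{p_b}{p_{\hat{b}}}, & \text{3) if } a \in \mathit{roots} \text{ and } b \notin \mathit{roots} \\
p_{\hat{a} \hat{b}} \times \dfrac{p_a}{p_{\hat{a}}} \times \dfrac{p_b}{p_{\hat{b}}}, & \text{4) otherwise}
\end{cases}
$$

Copyright: © 2014 The PLOS ONE Staff. This is an open-access article distributed under the terms of the Creative...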
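The four cases of Equation 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the hatted symbols are read as the parent terms of a and b in their ontologies (an interpretation suggested by the abstract's remark that the expected value is calculated from the parent rule), and the function name `expected_support` and its parameters are hypothetical, not taken from the paper.

```python
from typing import Optional


def expected_support(
    p_a: float,
    p_b: float,
    p_parent_a: Optional[float] = None,
    p_parent_b: Optional[float] = None,
    p_parent_rule: Optional[float] = None,
) -> float:
    """Expected support of a rule a -> b (hypothetical helper).

    Assumptions, not from the source:
    - p_parent_a is None means that a is a root term (same for b).
    - p_parent_rule is the joint probability of the parent rule:
      (parent of a, b), (a, parent of b) or (parent of a, parent of b),
      depending on which of a and b are roots.
    """
    a_is_root = p_parent_a is None
    b_is_root = p_parent_b is None

    # Case 1: both terms are roots, so fall back to independence.
    if a_is_root and b_is_root:
        return p_a * p_b

    if p_parent_rule is None:
        raise ValueError("p_parent_rule is required when a or b is not a root")

    # Cases 2-4: scale the parent rule's probability by the share
    # each non-root term takes of its own parent.
    ratio_a = 1.0 if a_is_root else p_a / p_parent_a
    ratio_b = 1.0 if b_is_root else p_b / p_parent_b
    return p_parent_rule * ratio_a * ratio_b
```

For example, with made-up probabilities, `expected_support(0.1, 0.2, p_parent_a=0.4, p_parent_b=0.5, p_parent_rule=0.3)` evaluates case 4 as 0.3 × (0.1 / 0.4) × (0.2 / 0.5). A rule whose observed support exceeds this value would be the kind of rule the abstract describes as favored.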
Linking rare and common disease: mapping clinical disease-phenotypes to ontologies in therapeutic target validation
BACKGROUND The Centre for Therapeutic Target Validation (CTTV - https://www.targetvalidation.org/) was established to generate therapeutic target evidence from genome-scale experiments and analyses. CTTV aims to support the validity of therapeutic targets by integrating existing and newly-generated data. Data integration has been achieved in some resources by mapping metadata such as disease an...
Discovery of Ontologies for Learning Resources Using Word-based Clustering
Educational intermediaries are information systems that support the exchange of learning resources among dispersed users. The selection of the appropriate learning resources that cover specific educational needs requires a concise interaction between the user and system. This paper describes a data mining process for the discovery of ontologies from learning resources repositories. Ontologies e...
Efficient Rare Association Rule Mining Algorithm
Data mining is the process of discovering correlations, patterns, trends or relationships by searching through a large amount of data stored in repositories, corporate databases, and data warehouses. In the data mining field, the primary task is to mine frequent item sets from a transaction database using Association Rule Mining (ARM). Whereas the extraction of frequent patterns has focused the majo...